2 research outputs found

    Automatic Understanding of ATC Speech: Study of Prospectives and Field Experiments for Several Controller Positions

    Although there has been a lot of interest in recognizing and understanding air traffic control (ATC) speech, none of the published works have reported detailed field-data results. We have developed a system able to identify the language spoken and to recognize and understand sentences in both Spanish and English, and we present field results for several in-tower controller positions. To the best of our knowledge, this is the first time that field ATC speech (not simulated) has been captured, processed, and analyzed. The use of stochastic grammars accommodates the variations in the standard phraseology that appear in field data. The robust understanding algorithm developed reaches 95% concept accuracy from ATC text input. It also handles changes in the presentation order of the concepts and corrects errors introduced by the speech recognition engine, improving the percentage of fully correctly understood sentences over the percentage of fully correctly recognized sentences by 17% absolute for English and 25% absolute for Spanish. We also analyze the errors due to the spontaneity of the speech and compare them to read speech: the 96% word accuracy obtained for read speech drops to 86% word accuracy on field ATC data for Spanish on the "clearances" task, confirming that field data are needed to estimate the performance of a system. A literature review and a critical discussion of the possibilities of speech recognition and understanding technology applied to ATC speech are also given.
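
    The abstract itself contains no code; as a rough illustration of the kind of order-independent concept extraction it describes (a sketch with invented, hypothetical concept patterns, not the authors' stochastic grammar), the following Python fragment spots callsign, runway, and action concepts wherever they occur in a recognized clearance, so a reordered or partially misrecognized sentence can still be fully understood.

        import re

        # Hypothetical, simplified concept spotter for tower clearances.
        # Each concept is extracted wherever it appears in the utterance, so the
        # presentation order of the concepts does not matter.
        CONCEPT_PATTERNS = {
            "callsign": re.compile(r"\b(iberia|speedbird)\s+(\d+)\b"),
            "runway": re.compile(r"\brunway\s+(\d{1,2})\s*(left|right|center)?\b"),
            "action": re.compile(r"\b(cleared to land|cleared for takeoff|hold short)\b"),
        }

        def extract_concepts(utterance: str) -> dict:
            """Return the concepts found in a recognized utterance."""
            text = utterance.lower()
            found = {}
            for name, pattern in CONCEPT_PATTERNS.items():
                match = pattern.search(text)
                if match:
                    found[name] = " ".join(g for g in match.groups() if g)
            return found

        # Both word orders yield the same concepts.
        print(extract_concepts("Iberia 3456 runway 32 left cleared to land"))
        print(extract_concepts("Cleared to land runway 32 left, Iberia 3456"))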

    Implementation of Dialog Applications in an Open-Source VoiceXML Platform

    In this paper, we describe the approach followed to adopt the VoiceXML standard in a dialog system platform already available in our group. As the VoiceXML interpreter we have chosen OpenVXI, an open-source, portable solution in which we can make the modifications needed to adapt it to the characteristics of our recognition and synthesis modules, so we emphasize the changes we had to make to this interpreter. We also review some relevant modules of our platform and their capabilities, highlighting their use of standards such as SSML for the text-to-speech system and JSGF for the specification of recognition grammars. Finally, we discuss several ideas regarding the limitations detected in VoiceXML.
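
    As a rough, hypothetical illustration of the integration described above (this is not OpenVXI code; the Synthesizer and Recognizer classes are invented for the sketch), the following Python fragment shows a single field turn in which an SSML prompt is handed to a text-to-speech module and a JSGF grammar constrains the recognizer, the two standards highlighted in the abstract.

        # Hypothetical glue code: the VoiceXML interpreter hands standard artifacts
        # (an SSML prompt and a JSGF grammar) to the platform's own synthesis and
        # recognition modules.  These classes are invented for this sketch and do
        # not correspond to the OpenVXI API or to the authors' modules.

        SSML_PROMPT = "<speak>Say <emphasis>yes</emphasis> or <emphasis>no</emphasis>.</speak>"

        JSGF_GRAMMAR = """#JSGF V1.0;
        grammar confirmation;
        public <answer> = yes | no;
        """

        class Synthesizer:
            def speak(self, ssml: str) -> None:
                # A real module would render the SSML markup as audio.
                print("TTS renders:", ssml)

        class Recognizer:
            def listen(self, jsgf: str) -> str:
                # A real module would decode audio constrained by the JSGF grammar.
                return "yes"

        def run_field_turn(tts: Synthesizer, asr: Recognizer) -> str:
            """One VoiceXML-style field turn: play the prompt, then recognize."""
            tts.speak(SSML_PROMPT)
            return asr.listen(JSGF_GRAMMAR)

        print(run_field_turn(Synthesizer(), Recognizer()))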